ARROW-15841: [R] Implement SafeCallIntoR to safely call the R API from another thread#12558
ARROW-15841: [R] Implement SafeCallIntoR to safely call the R API from another thread#12558paleolimbot wants to merge 28 commits into
Conversation
|
|
|
(See also westonpace#10) |
|
Redoing this with an eye towards where I would actually like to use it! I think that it does need a synchronous The places where I would prefer to use this in some other PRs:
Some sketch examples: arrow:::TestSafeCallIntoR(
function() "string one!",
opt = "on_main_thread"
)
#> [1] "string one!"
arrow:::TestSafeCallIntoR(
function() stop("This is an error"),
opt = "on_main_thread"
)
#> Error in (function () : This is an error
arrow:::TestSafeCallIntoR(
function() "string one!",
opt = "async_with_executor"
)
#> [1] "string one!"
# This runs with the expected error, but causes subsequent segfaults, probably related
# to the error_token_ (maybe having to do with the copy-constructor?)
# arrow:::TestSafeCallIntoR(
# function() stop("This is an error"),
# opt = "async_with_executor"
# )
arrow:::TestSafeCallIntoR(
function() "string one!",
opt = "async_without_executor"
)
#> Error: NotImplemented: Call to R from a non-R thread without an event loop |
westonpace
left a comment
There was a problem hiding this comment.
This will be an awesome capability. A few nits and thoughts but overall I think this is the right direction.
| // [[arrow::export]] | ||
| std::string TestSafeCallIntoR(cpp11::function r_fun_that_returns_a_string, |
There was a problem hiding this comment.
We don't have a precedent for this in the Arrow R package (a place to test C++ code from C++ that is hard to test from R). We probably don't want something like this running on CRAN, but I'm not sure what the best way is to fence this off / keep it from compiling anywhere except CI?
There was a problem hiding this comment.
I haven't dug in too much too the code yet, but is this resolved with new commits, or do we still need to find a way to gate this?
There was a problem hiding this comment.
Neal took a quick look and said it it's fine as long as there's a note as to where TestSafeCallIntoR is defined (there's some Altrep tests that do this, too)
westonpace
left a comment
There was a problem hiding this comment.
Very clean and easier to understand now. Thanks for figuring this out.
| .onLoad <- function(...) { | ||
| if (arrow_available()) { | ||
| # Make sure C++ knows on which thread it is safe to call the R API | ||
| InitializeMainRThread() |
There was a problem hiding this comment.
Do we know for a fact that the R thread never changes? For example, in JS, there is always "one thread" but the actual thread id can change from iteration to iteration of the event loop.
There was a problem hiding this comment.
I asked in the r-lib slack channel and nobody seems to feel that this will be a problem. They did advise to check parallel::mclapply() since this creates a fork of the process, but a check seems to indicate that the value of std::this_thread::get_id() seems to be stable if somebody does happen to do that:
cpp11::cpp_source(code = '
#include "cpp11.hpp"
#include <thread>
#include <sstream>
[[cpp11::register]]
std::string thread_id() {
std::thread::id id = std::this_thread::get_id();
std::stringstream ss;
ss << id;
return ss.str();
}
')
thread_id()
#> [1] "0x100e33d40"
unique(lapply(1:1e3, function(x) thread_id()))
#> [[1]]
#> [1] "0x100e33d40"
unique(parallel::mclapply(1:1e3, function(x) thread_id(), mc.cores = 8))
#> [[1]]
#> [1] "0x100e33d40"| }); | ||
|
|
||
| thread_ptr->join(); | ||
| delete thread_ptr; |
There was a problem hiding this comment.
So this is probably fine but you could wrap thread_ptr in a unique_ptr. For example:
thread_ptr = std::unique_ptr<std::thread>(new std::thread(...));
It gets rid of the delete call and guards you against very unlikely things like ->join() throwing an exception and the memory never getting cleaned up (not that such a thing would really matter in test code).
There was a problem hiding this comment.
I can't get this to work without a crash!
There was a problem hiding this comment.
Odd. If you want to create a commit (doesn't have to be part of any PR) then I'd be happy to take a look and see what was going on. Otherwise, like I said, it isn't very important, so let's not worry too much about it.
| # under the License. | ||
|
|
||
| # Note that TestSafeCallIntoR is defined in safe-call-into-r-impl.cpp | ||
|
|
There was a problem hiding this comment.
| skip_on_cran() |
Is this sufficient to make sure we don't test this on cran?
There was a problem hiding this comment.
(I added them inside the test_that() blocks as mostly a stylistic choice!)
jonkeane
left a comment
There was a problem hiding this comment.
I'll merge when CI is green. Thank you!
|
Benchmark runs are scheduled for baseline = 76d064c and contender = e110eac. e110eac is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
This is a very WIP draft that currently just sketches a few things related to calling into R from other threads. Some code to get started: